Fix lumina2 pad token shape mismatch for some GGUF conversions #392
vaclavmuller wants to merge 1 commit into city96:main
Conversation
Thank you! This fixed the issue I was having.

I'd also like this merged.

Thank you! That solved my problem.

@city96 Is there anything else you need done here?

I don't pretend to understand everything going on here, but I've found that this only causes an error when these two layers are in BF16; in F32 or F16 they seem to work fine. I'm not sure why that is, or I may be experiencing an unrelated issue.

This should be on main.

While this fixes loading for Z-Image Turbo, I am getting similar errors for Z-Image base (z_image_base_Q8_0.gguf, z_image_base_BF16.gguf).
This PR fixes a shape mismatch when loading some lumina2 / NextDiT GGUF models
(e.g. Z-Image Turbo GGUF builds).
Some GGUF conversions store `x_pad_token` and `cap_pad_token` as 1D vectors (`[D]`) instead of the expected 2D shape (`[1, D]`), which causes `load_state_dict` to fail.

The loader now:

- reshapes these pad tokens to the expected 2D shape when the `orig_shape` metadata is missing

Tested with:
https://huggingface.co/leejet/Z-Image-Turbo-GGUF
Addresses #379
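
For readers hitting the same error elsewhere, the fix described above can be sketched as a small state-dict preprocessing step. This is a minimal illustration, not the PR's actual code: the helper name `fix_pad_token_shapes` is hypothetical, and it assumes the affected tensors are exactly the `x_pad_token` / `cap_pad_token` entries mentioned in the description.

```python
import torch

# Keys that some GGUF conversions store as 1D [D] instead of 2D [1, D]
# (per the PR description; adjust if your checkpoint uses other names).
PAD_TOKEN_KEYS = ("x_pad_token", "cap_pad_token")


def fix_pad_token_shapes(state_dict):
    """Unsqueeze 1D pad-token tensors to the 2D [1, D] shape the model expects.

    This is a hypothetical sketch of the workaround: when a pad-token tensor
    arrives as a flat [D] vector, add a leading batch dimension so that
    load_state_dict no longer reports a shape mismatch.
    """
    for key, tensor in list(state_dict.items()):
        # Match on the final component so prefixed keys (e.g. "model.x_pad_token")
        # are also covered.
        if key.split(".")[-1] in PAD_TOKEN_KEYS and tensor.dim() == 1:
            state_dict[key] = tensor.unsqueeze(0)  # [D] -> [1, D]
    return state_dict
```

A tensor that already has the `[1, D]` shape (or any other key) is left untouched, so running the helper on a correctly converted checkpoint is a no-op.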